We provide a limit theory for a general class of kernel smoothed 
U-statistics that may be used for specification testing in time series 
regression with nonstationary data. The test framework allows for 
linear and nonlinear models with endogenous regressors that have 
autoregressive unit roots or near unit roots. The limit theory for 
the specification test depends on the self-intersection local time of 
a Gaussian process. A new weak convergence result is developed for 
certain partial sums of functions involving nonstationary time series 
that converge to the intersection local time process. This result is of 
independent interest and is useful in other applications. Simulations 
examine the finite sample performance of the test. 



1. Introduction. One of the advantages of nonparametric modeling is 
the opportunity for specification testing of particular parametric models 
against general alternatives. The past three decades have witnessed many 
developments in such specification tests involving nonparametric and semi- 
parametric techniques that allow for independent, short memory and long- 
range dependent data. Recent research on the nonparametric modeling of 
nonstationary data opens up some new possibilities that seem relevant to 
applications in many fields, including nonlinear diffusion models in contin- 
uous time [Bandi and Phillips (2003, 2007)] and cointegration models in 
economics and finance. 

Cointegration models were originally developed in a linear parametric 
framework that has been widely used in econometric applications. That 
framework was extended in Park and Phillips (1999, 2001) to allow for
nonlinear parametric formulations under certain restrictions on the function
nonlinearity. While considerably broadening the class of allowable nonsta- 
tionary models, the potential for parametric misspecification in these models 
is still present and is important to test in applied work. 

The hypothesis of linear cointegration is of particular interest in this
context, given the vast empirical literature. Recent papers by Karlsen,
Myklebust and Tjøstheim (2007), Wang and Phillips (2009a, 2009b, 2011) and
Schienle (2008) have developed asymptotic theory for nonparametric kernel 
regression of nonlinear nonstationary systems. This work facilitates the com- 
parison of various parametric specifications against a more general nonpara- 
metric nonlinear alternative. Such comparisons may be based on weighted 
sums of squared differences between the parametric and nonparametric esti- 
mates of the system or on a kernel-based U-statistic test which uses 
a smoothed version of the parametric estimator in its construction [e.g., 
Gao (2007), Chapter 3]. 

A major obstacle in the development of such specification tests is the tech- 
nical difficulty of developing a limit theory for these weighted sums which 
typically involve kernel functions with multiple nonstationary regressor ar- 
guments. Few results are currently available, and because of this shortage, 
attempts to develop specification tests for nonlinear regression models with 
nonstationarity have been highly specific and do not involve nonparametric 
alternatives or kernel methods. Some examples of recent work in parametric 
models include Choi and Saikkonen (2004, 2010), Marmer (2008), Hong and 
Phillips (2010) and Kasparis and Phillips (2012). An exception is the recent 
work for testing linearity in autoregression and parametric time series regres- 
sion by Gao et al. (2009a, 2009b) who obtained a limit distribution theory 
for a kernel based specification test in a setting that involves martingale 
difference errors and random walk regressors. 

The present paper makes a related contribution and seeks to provide 
a general theory of specification tests that is applicable for a wider class 
of nonstationary regressors that includes both unit root and near unit root 
processes. The latter are important in practical work where a unit root 
restriction is deemed too restrictive. The paper contributes to this emerging 
literature in two ways. First, we provide a limit theory for a general class 
of kernel-based specification tests of parametric nonlinear regression models 
that allows for near unit root processes driven by short memory (linear 
process) errors. This limit theory should be widely applicable to specification 
testing in nonlinear cointegrated systems. 

Second, the limit theory of the specification test involves the self-intersec- 
tion local time of a Gaussian limit process. The result requires establishing 
weak convergence to this self-intersection local time process, which is of inde- 
pendent interest, and a feasible central limit theorem involving an empirical 
estimator of the intersection local time that can be used to construct the test 
statistic. Thus, the results provide some new theories for intersection local 
time, weak convergence and specification test asymptotics that are relevant 
in applications. 

The paper is organized as follows. Section 2 lays out the nonparametric 
and parametric models and assumptions. Section 3 gives the main results on 
specification test limit theory. Section 4 reports some simulation evidence on 
test performance. Section 5 provides the weak convergence theory for inter- 
section local time. Section 6 gives proofs of the main theorems in Section 3. 
The proofs of the local time limit theory in Section 5 and some supplemental 
technical results in Section 6 can be found in the supplementary material 
[Wang and Phillips (2012)]. 

2. Model and assumptions. We consider the nonlinear cointegrating re- 
gression model 

(2.1)  $y_{t+1} = f(x_t) + u_{t+1}, \quad t = 1, 2, \ldots, n,$

where $u_t$ is a stationary error process, and $x_t$ is a nonstationary regressor.
We are interested in testing the null hypothesis

$H_0: f(x) = f(x, \theta), \quad \theta \in \Omega_0,$

for $x \in R$, where $f(x, \theta)$ is a given real function indexed by a vector $\theta$ of
unknown parameters which lie in the parameter space $\Omega_0$.

To test $H_0$ we make use of the following kernel-smoothed test statistic:

(2.2)  $S_n = \sum_{s,t=1,\, s \ne t}^{n} \hat u_{t+1} \hat u_{s+1} K[(x_t - x_s)/h],$

involving the parametric regression residuals $\hat u_{t+1} = y_{t+1} - f(x_t, \hat\theta)$, where
$K(x)$ is a nonnegative real kernel function, $h$ is a bandwidth satisfying
$h \equiv h_n \to 0$ as the sample size $n \to \infty$ and $\hat\theta$ is a parametric estimator of $\theta$ under
the null $H_0$ that is consistent whenever $\theta \in \Omega_0$.

The statistic $S_n$ in (2.2) has commonly been applied to test parametric
specifications in stationary time series regression [see Gao (2007)] and
was used by Gao et al. (2009a, 2009b) to test for linearity in autoregression
and a parametric conditional mean function in time series regression
involving a random walk regressor. $S_n$ is a weighted U-statistic with kernel
weights that depend on standardized differentials $(x_t - x_s)/h$ of the regressor.
The weights focus attention in the statistic on those components in
the sum where the nonstationary regressor $x_t$ nearly intersects itself. This
smoothing scheme gives prominence to product components $\hat u_{t+1} \hat u_{s+1}$ in the
sum where $s$ and $t$ may differ considerably but for which the corresponding
regressor process takes similar values (i.e., $x_t, x_s \approx x$ for some $x$), thereby
enabling a test of $H_0$.
The difficulty in the development of an asymptotic theory for $S_n$ stems
from the presence of the kernel weights $K((x_t - x_s)/h)$. The behavior of
these weights depends on the self intersection properties of $x_t$ in the
sample, and, as $n \to \infty$, this translates into the corresponding properties of
the stochastic process to which a standardized version of $x_t$ converges. To
establish asymptotics for $S_n$, we need to account for this limit behavior,
which leads to a new limit theory involving the self-intersection local time
of a Gaussian process (i.e., the local time for which the process intersects
itself).

We use the following assumptions in our development. 

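Assumption 1. (i) $\{\epsilon_t\}_{t \in Z}$ is a sequence of independent and identically
distributed (i.i.d.) continuous random variables with $E\epsilon_0 = 0$, $E\epsilon_0^2 = 1$, and
with the characteristic function $\varphi(t)$ of $\epsilon_0$ satisfying $|t||\varphi(t)| \to 0$, as $|t| \to \infty$.

(ii)

(2.3)  $x_t = \rho x_{t-1} + \eta_t, \quad x_0 = 0, \quad \rho = 1 + \kappa/n, \quad 1 \le t \le n,$

where $\kappa$ is a constant and $\eta_t = \sum_{k=0}^{\infty} \phi_k \epsilon_{t-k}$ with $\phi \equiv \sum_{k=0}^{\infty} \phi_k \ne 0$ and
$\sum_{k=0}^{\infty} k^{1+\delta} |\phi_k| < \infty$ for some $\delta > 0$.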

Assumption 2. (i) $\{u_t, F_t\}_{t \ge 1}$, where $F_t$ is a sequence of increasing
$\sigma$-fields which is independent of $\epsilon_k$, $k \ge t + 1$, forms a martingale difference
satisfying $E(u_{t+1}^2 \mid F_t) \to_{a.s.} \sigma^2 > 0$ as $t \to \infty$ and $\sup_{t \ge 1} E(|u_{t+1}|^4 \mid F_t) < \infty$.

(ii) $x_t$ is adapted to $F_t$, and there exists a correlated vector Brownian
motion $(W, V)$ such that

(2.4)  $\left( \frac{1}{\sqrt{n}} \sum_{j=1}^{[nt]} \epsilon_j,\ \frac{1}{\sqrt{n}} \sum_{j=1}^{[nt]} u_{j+1} \right) \Rightarrow_D (W(t), V(t))$

on $D[0,1]^2$ as $n \to \infty$.

Assumption 3. $K(x)$ is a nonnegative real function satisfying
$\sup_x K(x) < \infty$ and $\int K(x)\,dx < \infty$.

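Assumption 4. (i) There is a sequence of positive real numbers $\delta_n$
satisfying $\delta_n \to 0$ as $n \to \infty$ such that $\sup_{\theta \in \Omega_0} \|\hat\theta - \theta\| = o_P(\delta_n)$, where $\|\cdot\|$
denotes the Euclidean norm.

(ii) There exists some $\varepsilon_0 > 0$ such that $\partial^2 f(x,t)/\partial t^2$ is continuous in both
$x \in R$ and $t \in \Theta_0$, where $\Theta_0 = \{t : \|t - \theta\| \le \varepsilon_0, \theta \in \Omega_0\}$.

(iii) Uniformly for $\theta \in \Omega_0$,

$\left| \frac{\partial f(x,t)}{\partial t}\Big|_{t=\theta} \right| + \left| \frac{\partial^2 f(x,t)}{\partial t^2}\Big|_{t=\theta} \right| \le C(1 + |x|^{\beta})$

for some constants $\beta \ge 0$ and $C > 0$.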
(iv) Uniformly for $\theta \in \Omega_0$, there exist $0 < \gamma' \le 1$ and $\max\{0, 3/4 - 2\beta\} < \gamma \le 1$
such that

(2.5)  $|g(x+y, \theta) - g(x, \theta)| \le C|y|^{\gamma} \begin{cases} 1 + |x|^{\beta-1} + |y|^{\beta}, & \text{if } \beta > 0, \\ 1 + |x|^{\gamma'-1}, & \text{if } \beta = 0, \end{cases}$

for any $x, y \in R$, where $g(x,t) = \partial f(x,t)/\partial t$.

Assumption 5. $nh^2 \to \infty$, $\delta_n^2 n^{1+\beta} \sqrt{h} \to 0$ and $nh^4 \log^2 n \to 0$, where $\beta$
and $\delta_n^2$ are defined as in Assumption 4. Also, $\int (1 + |x|^{2\beta+1}) K(x)\,dx < \infty$
and $E|\epsilon_0|^{4\beta+2} < \infty$.

Assumption 1 allows for both a unit root ($\kappa = 0$) and a near unit root
($\kappa \ne 0$) regressor by virtue of the localizing coefficient $\kappa$ and is standard
in the near integrated regression framework [Phillips (1987, 1988), Chan
and Wei (1987)]. Compared to the estimation theory developed in Wang
and Phillips (2009a, 2009b) and for technical convenience in the present
work, we impose the stronger summability condition $\sum_{k=0}^{\infty} k^{1+\delta} |\phi_k| < \infty$
for some $\delta > 0$ on the coefficients of the linear process $\eta_t = \sum_{k=0}^{\infty} \phi_k \epsilon_{t-k}$
driving the regressor $x_t$. Under these conditions, it is well known that the
standardized process $x_{[nt],n} = x_{[nt]}/(\sqrt{n}\phi)$ converges weakly to the Gaussian
process $G(t) = \int_0^t e^{\kappa(t-s)}\,dW(s)$, where $W(t)$ is a standard Brownian motion.
See (5.2) below or Phillips and Solo (1992).

Assumption 2(i) is a standard martingale difference condition on the
equation innovations $u_t$, so that $\mathrm{cov}(u_{t+1}, x_t) = E[x_t E(u_{t+1} \mid F_t)] = 0$. Wang and
Phillips (2009b) allowed for endogeneity in their nonparametric structure, so
the equation error could be serially dependent and cross-correlated with $x_s$
for $|t - s| < m_0$ for some finite $m_0$. It is not clear at the moment if the results
of the present paper on testing extend to the more general error structure
considered in Wang and Phillips (2009b), but simulation results suggest that
this may be so. Assumption 2(ii) is a standard functional law for partial sum
processes [e.g., Park and Phillips (2001)].

Assumption 3 is a standard condition on $K(x)$ as in the stationary
situation. The integrability condition is weaker than the common alternative
requirement that $K(x)$ has compact support.

As seen in Assumption 5, the sequence $\delta_n$ in Assumption 4(i) may be
chosen as $\delta_n = n^{-(1+\beta)/2} h^{-1/8}$. As $h \to 0$ and $\kappa = 0$ in (2.3), Assumption 4(i)
holds under very general conditions, such as those of Theorem 5.2 in Park
and Phillips (2001). Indeed, by Park and Phillips (2001), we may choose $\hat\theta$
such that $\sup_{\theta \in \Omega_0} \|\hat\theta - \theta\| = O_P(n^{-(1+\beta)/2})$, under our Assumption 4(ii)-(iv).
Assumption 4(ii)-(iv) is quite weak and includes a wide class of functions.
Typical examples include polynomial forms like $f(x, \theta) = \theta_1 + \theta_2 x + \cdots + \theta_k x^{k-1}$,
where $\theta = (\theta_1, \ldots, \theta_k)$, power functions like $f(x, a, b, c) = a + b x^c$,
shift functions like $f(x, \theta) = x(1 + \theta x) I(x \ge 0)$ and weighted exponentials
such as $f(x, a, b) = (a + b e^x)/(1 + e^x)$. However, Assumption 4 excludes
models where $f(x, \theta)$ is integrable, because parametric rates of convergence are
known to be $O(n^{1/4})$ in this case [see Park and Phillips (2001)]. It seems
that cases with integrable $f(x, \theta)$ require different techniques and these are
left for future investigation.
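The regressor process in Assumption 1(ii) is straightforward to simulate, which can help build intuition for the standardization by $\sqrt{n}\phi$. The following is a minimal sketch, not part of the original study: the function name `simulate_near_unit_root`, the restriction to a finite-order moving average for $\eta_t$, and the Gaussian innovations are all illustrative assumptions.

```python
import numpy as np


def simulate_near_unit_root(n, kappa=0.0, ma_coefs=(1.0,), rng=None):
    """Simulate x_1..x_n from x_t = rho * x_{t-1} + eta_t with rho = 1 + kappa/n, x_0 = 0.

    eta_t = sum_k ma_coefs[k] * eps_{t-k} is a finite-order linear process,
    an illustrative special case of Assumption 1(ii) (hypothetical helper).
    Returns (x, x_std), where x_std = x / (sqrt(n) * phi) and phi = sum(ma_coefs).
    """
    rng = np.random.default_rng() if rng is None else rng
    coefs = np.asarray(ma_coefs, dtype=float)
    phi = coefs.sum()
    if phi == 0.0:
        raise ValueError("Assumption 1 requires phi = sum of coefficients != 0")
    q = coefs.size - 1
    # Innovations eps_{1-q}, ..., eps_n (Gaussian is an assumption made here).
    eps = rng.standard_normal(n + q)
    # 'valid' convolution keeps the n fully overlapping terms eta_1, ..., eta_n.
    eta = np.convolve(eps, coefs, mode="valid")
    rho = 1.0 + kappa / n
    x = np.empty(n)
    prev = 0.0
    for t in range(n):
        prev = rho * prev + eta[t]
        x[t] = prev
    return x, x / (np.sqrt(n) * phi)


# Example: a near unit root path with kappa = -5 and MA(1) innovations.
x, x_std = simulate_near_unit_root(500, kappa=-5.0, ma_coefs=(1.0, 0.5),
                                   rng=np.random.default_rng(1))
```

Setting `kappa=0.0` and `ma_coefs=(1.0,)` reduces the recursion to a plain random walk, the case covered by Gao et al. (2009a, 2009b).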

As in estimation limit theory, the condition in Assumption 5 that the
bandwidth $h$ satisfies $nh^2 \to \infty$ is necessary. The further condition that
$nh^4 \log^2 n \to 0$ restricts the choice of $h$ and, at least with the techniques used
here, seems difficult to relax in the general case studied in the present work,
although it may be substantially relaxed in less general models as discussed
later in the paper. The condition that $\delta_n^2 n^{1+\beta} \sqrt{h} \to 0$ holds automatically if
$\sup_{\theta \in \Omega_0} \|\hat\theta - \theta\| = O_P(n^{-(1+\beta)/2})$. As explained above, the latter condition
holds true under very general settings such as Assumption 4(ii)-(iv). We also
impose a higher moment condition on the innovation $\epsilon_0$ in Assumption 5
which helps in the development of the limit theory.
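To see how the two bandwidth conditions in Assumption 5 interact, it may help to specialize them to power-law bandwidths. The worked step below is a sketch; the parametrization $h = n^{-p}$ is an illustrative assumption (it is the form used for the bandwidths in Section 4).

```latex
% Assumption for illustration: h = n^{-p} with p > 0.
\begin{align*}
nh^2 &= n^{1-2p} \to \infty
  &&\Longleftrightarrow\quad p < 1/2, \\
nh^4 \log^2 n &= n^{1-4p} \log^2 n \to 0
  &&\Longleftrightarrow\quad p > 1/4 .
\end{align*}
% Both conditions therefore hold exactly when 1/4 < p < 1/2.
% At the boundary p = 1/4, n h^4 \log^2 n = \log^2 n \to \infty,
% so the second condition fails there.
```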

3. Main results on specification. The limit distribution of $S_n$ under
standardization involves the nuisance parameters $\sigma$ and $\phi$, where $\sigma^2$ is the limit of
$Eu_t^2$ as $t \to \infty$ and $\phi$ is the sum of the coefficients of the linear process appearing in
Assumption 1; see Corollary 3.1 below. While convenient, this formulation
obviously restricts direct use of the result in applications. The dependence
on the nuisance parameters can be removed simply by self-normalization.
Indeed, by defining

$V_n^2 = \sum_{s,t=1,\, s \ne t}^{n} \hat u_{t+1}^2 \hat u_{s+1}^2 K^2[(x_t - x_s)/h],$

we have the following main result.

Theorem 3.1. Under Assumptions 1-5 and the null hypothesis, we have

(3.1)  $\frac{S_n}{\sqrt{2}\, V_n} \to_D N,$

where $N$ is a standard normal variate.

The limit in Theorem 3.1 is normal and does not depend on any nuisance
parameters. As a test statistic, $Z_n = S_n/(\sqrt{2}\, V_n)$ therefore has a considerable advantage in
applications. In order to investigate the asymptotic power of the test, we consider
the local alternative models

$H_1: f(x) = f(x, \theta) + \rho_n m(x),$

where $\theta \in \Omega_0$, $\rho_n$ is a sequence of constants, and $m(x)$ is a real function.
This kind of local alternative model is commonly used in the theory of
nonparametric inference involving stationary data; see, for instance, Horowitz
and Spokoiny (2001).
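The self-normalized statistic $Z_n = S_n/(\sqrt{2}\, V_n)$ can be computed directly from the regressor and the parametric residuals. The sketch below is illustrative only: the function names, the Gaussian kernel, and the array convention (entry `t` of `resid` holds the residual paired with entry `t` of `x`) are assumptions rather than details taken from the paper.

```python
import numpy as np


def gaussian_kernel(z):
    """Standard normal density, one admissible choice under Assumption 3."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)


def specification_statistic(x, resid, h, kernel=gaussian_kernel):
    """Return (S_n, V_n, Z_n) for regressor x, parametric residuals resid, bandwidth h.

    resid[t] is the residual of the observation whose regressor value is x[t]
    (hypothetical layout chosen for this sketch).
    """
    x = np.asarray(x, dtype=float)
    resid = np.asarray(resid, dtype=float)
    # Kernel weights K[(x_t - x_s) / h] for every pair (t, s).
    weights = kernel((x[:, None] - x[None, :]) / h)
    np.fill_diagonal(weights, 0.0)  # the sums exclude s == t
    s_n = float(resid @ weights @ resid)
    v_n = float(np.sqrt((resid ** 2) @ (weights ** 2) @ (resid ** 2)))
    z_n = s_n / (np.sqrt(2.0) * v_n)
    return s_n, v_n, z_n


# Usage: compare z_n with an upper standard normal quantile (one-sided test).
rng = np.random.default_rng(0)
x_demo = np.cumsum(rng.standard_normal(200))
resid_demo = rng.standard_normal(200)
s_n, v_n, z_n = specification_statistic(x_demo, resid_demo, h=200 ** (-1 / 3))
```

The full pairwise weight matrix costs $O(n^2)$ memory, which is acceptable for the sample sizes considered in Section 4 but would need blocking for much larger $n$.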
Assumption 6. There exists a $\nu \ge 0$ such that

(3.2)  $0 < \inf_{|x| \ge 1} \frac{|m(x)|}{|x|^{\nu}} \le \sup_{x \in R} \frac{|m(x)|}{1 + |x|^{\nu}} < \infty,$

and there exist $0 < \gamma' \le 1$ and $\max\{0, 3/4 - 2\nu\} < \gamma \le 1$ such that

(3.3)  $|m(x+y) - m(x)| \le C|y|^{\gamma} \begin{cases} 1 + |x|^{\nu-1} + |y|^{\nu}, & \text{if } \nu > 0, \\ 1 + |x|^{\gamma'-1}, & \text{if } \nu = 0, \end{cases}$

for any $x, y \in R$ and for some constant $C > 0$.



Assumption 6 is quite weak and is satisfied by a large class of real
functions such as $m(x) = a_1 + a_2 x + \cdots + a_k x^{k-1}$, $m(x) = a + b x^c$ and
$m(x) = (a + b e^x)/(1 + e^x)$. If $m(x)$ is positive (or negative) on $R$, condition (3.3) is
not necessary.

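Theorem 3.2. In addition to Assumptions 1-6, suppose that $\int (1 + |x|^{2\nu+2}) K(x)\,dx < \infty$
and $E|\epsilon_0|^{4\nu+2} < \infty$. Then, under $H_1$, we have

(3.4)  $\lim_{n \to \infty} P\left( \frac{S_n}{\sqrt{2}\, V_n} \ge t_{\alpha} \right) = 1$

for any $\rho_n$ satisfying $\rho_n^2 n^{1/2+\nu} h^{1/2} \to \infty$, and for any $0 < \alpha < 1$, where
$\Phi(t_{\alpha}) = 1 - \alpha$ and $\Phi$ is the standard normal c.d.f.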

Theorem 3.2 shows that our test has nontrivial power against the local
alternative whenever $\rho_n \to 0$ at a rate that is slower than $n^{-1/4-\nu/2} h^{-1/4}$, as
$nh^2 \to \infty$. This is different from the stationary situation where in general a
test has nontrivial power only if $\rho_n \to 0$ at a rate that is slower than $n^{-1/2}$.
It is interesting to notice that the rate is related to the magnitude of $m(x)$
and the bandwidth $h$. The test has stronger discriminatory power the larger
the value of $\nu$. The reason is that the nonlinear shape characteristics in $m(x)$
are magnified over a wide domain and this property is exploited by the test
because the nonstationary regressor is recurrent.
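The rate in this discussion can be read off from the divergence condition of Theorem 3.2. The short derivation below is a sketch for positive $\rho_n$; its last line uses, purely as an illustration, the sequence $\rho_n = n^{-1/4-\nu/3} h^{-1/4}$ adopted in the simulation design of Section 4.

```latex
\begin{align*}
\rho_n^2\, n^{1/2+\nu} h^{1/2} \to \infty
  &\iff \frac{\rho_n}{n^{-1/4-\nu/2}\, h^{-1/4}} \to \infty, \\
\rho_n = n^{-1/4-\nu/3}\, h^{-1/4}
  &\implies \rho_n^2\, n^{1/2+\nu} h^{1/2} = n^{\nu/3} \to \infty
  \quad (\nu > 0).
\end{align*}
```

The first line holds because the left-hand quantity is the square of the ratio on the right; the second shows that the simulated alternatives shrink more slowly than the threshold rate for every $\nu > 0$.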

Theorem 3.2 seems to be new to the literature. Under very strict
restrictions (namely that $x_t$ is a random walk and $x_t$ is independent of $u_t$), the
result in Theorem 3.1 has been considered in Gao et al. (2009a). Besides being
more general, our result relies on techniques that are quite different from
those of Gao et al. (2009a, 2009b). To outline the essentials of
the argument in the proof of Theorem 3.1, under the null hypothesis, we
split $S_n$ as

(3.5)  $S_n = 2 \sum_{t=2}^{n} u_{t+1} Y_{nt} + 2 \sum_{i,t=1,\, i \ne t}^{n} u_{i+1} [f(x_t, \theta) - f(x_t, \hat\theta)] K[(x_t - x_i)/h]$
       $\quad + \sum_{i,t=1,\, i \ne t}^{n} [f(x_t, \theta) - f(x_t, \hat\theta)] [f(x_i, \theta) - f(x_i, \hat\theta)] K[(x_t - x_i)/h]$
       $= 2 S_{1n} + 2 S_{2n} + S_{3n}$, say,

where $Y_{nt} = \sum_{i=1}^{t-1} u_{i+1} K[(x_t - x_i)/h]$. It will be proved in Section 6.1 that
the terms $S_{2n}$ and $S_{3n}$ are negligible in comparison with $S_{1n}$. Furthermore it
will be proved that, under the null hypothesis,

(3.6)  $V_n^2 = \sigma^4 \sum_{t,s=1,\, t \ne s}^{n} K^2[(x_t - x_s)/h] + o_P(n^{3/2} h) = 2 \sigma^2 \sum_{t=2}^{n} Y_{nt}^2 + o_P(n^{3/2} h).$

By virtue of these facts, Theorem 3.1 follows from the following theorem,
giving a joint convergence result for $S_{1n}$ and its conditional variance $\sum_{t=2}^{n} Y_{nt}^2$.
This result, along with the following Corollary 3.1, is of some independent
interest.
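The martingale form of the leading term can be checked numerically. The sketch below computes $Y_{nt}$ and $S_{1n} = \sum_{t=2}^{n} u_{t+1} Y_{nt}$ from true errors and confirms that $2 S_{1n}$ reproduces the full off-diagonal double sum. It assumes a symmetric kernel (here Gaussian), which is what makes the two orderings of each pair contribute equally; the function names and array layout are hypothetical.

```python
import numpy as np


def gaussian_kernel(z):
    """Symmetric kernel used for illustration."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)


def leading_term(x, u, h, kernel=gaussian_kernel):
    """Return (S_1n, Y, full_sum).

    u[t] plays the role of u_{t+1}, the error paired with x[t].
    Y[t] = sum over i < t of u[i] * K[(x[t] - x[i]) / h], so Y[0] = 0.
    full_sum = sum over t != i of u[t] * u[i] * K[(x[t] - x[i]) / h].
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    weights = kernel((x[:, None] - x[None, :]) / h)
    # Strictly lower triangle keeps only past observations i < t.
    y = np.tril(weights, k=-1) @ u
    s_1n = float(u @ y)
    off_diagonal = weights - np.diag(np.diag(weights))
    full_sum = float(u @ off_diagonal @ u)
    return s_1n, y, full_sum


rng = np.random.default_rng(0)
x_demo = np.cumsum(rng.standard_normal(150))
u_demo = rng.standard_normal(150)
s_1n, y_demo, full_sum = leading_term(x_demo, u_demo, h=0.3)
# For a symmetric kernel, 2 * s_1n and full_sum agree up to rounding.
```

Because each $Y_{nt}$ depends only on errors dated before $u_{t+1}$, the summands $u_{t+1} Y_{nt}$ form a martingale difference array under Assumption 2, which is the structure the text exploits.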

Theorem 3.3. Under Assumptions 1-3, $nh^2 \to \infty$ and $nh^4 \log^2 n \to 0$,
we have

(3.7)  $\left( \frac{1}{d_n} \sum_{t=2}^{n} u_{t+1} Y_{nt},\ \frac{1}{d_n^2} \sum_{t=2}^{n} Y_{nt}^2 \right) \to_D (\eta N, \eta^2),$

where $d_n^2 = (2\phi)^{-1} \sigma^2 n^{3/2} h \int_{-\infty}^{\infty} K^2(x)\,dx$, $\eta^2 = L_G(1, 0)$ is the self intersection
local time generated by the process $G = \int_0^t e^{\kappa(t-s)}\,dW(s)$, and $N$ is
a standard normal variate which is independent of $\eta^2$.

Corollary 3.1. Under Assumptions 1-5, we have

$S_n / \tau_n \to_D \eta N,$

where $\tau_n^2 = (8\phi)^{-1} \sigma^4 n^{3/2} h \int_{-\infty}^{\infty} K^2(x)\,dx$, and $\eta^2$ and $N$ are defined as in
Theorem 3.3.

Here and below, we define

(3.8)  $L_G(t, u) = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_0^t \int_0^t 1[|(G(x) - G(y)) - u| < \varepsilon]\,dx\,dy = \int_0^t \int_0^t \delta_u[G(x) - G(y)]\,dx\,dy,$

where $\delta_u$ is the Dirac function. $L_G(t, u)$ characterizes the amount of time over
the interval $[0, t]$ that the process $G(t)$ spends at a distance $u$ from itself,
and is well defined, as shown in Section 5. When $u = 0$, $L_G(t, 0)$ describes
the self-intersection time of the process $G(t)$. Using the definition of the
Dirac function, the extended occupation times formula [e.g., Revuz and Yor
(1999), page 232], and integration by parts with the local time measure, we
may write

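(3.9)  $L_G(t, 0) = 2 \int_0^t \int_0^y \delta_0[G(x) - G(y)]\,dx\,dy$
       $= 2 \int_0^t \ell_G(s, G(s))\,ds$
       $= 2 \int_{-\infty}^{\infty} \int_0^t \ell_G(s, a)\,d\ell_G(s, a)\,da$
       $= \int_{-\infty}^{\infty} \ell_G(t, a)^2\,da,$

where $\ell_G(t, a)$ is the local time spent by the process $G$ at $a$ over the time
interval $[0, t]$, namely,

$\ell_G(t, a) = \int_0^t \delta_a[G(s)]\,ds = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_0^t 1[|G(s) - a| < \varepsilon]\,ds.$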

The process $\ell_G(s, G(s))$ is the local time that the process $G$ has spent at
its current position $G(s)$ over the time interval $[0, s]$. It appears in the limit
theory for nonparametric nonstationary spurious regression [Phillips (2009)].
Aldous (1986) gave (3.9) for the case of Brownian motion.
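The two characterizations of the self-intersection local time, the pairwise limit in (3.8) and the integrated squared local time in (3.9), can be compared numerically on a discretized path. The following is a rough sketch under stated assumptions: the path is observed on an equally spaced grid over $[0, 1]$, the limit $\varepsilon \to 0$ is replaced by a small fixed `eps`, and the local time is replaced by a histogram occupation density. The function names are hypothetical.

```python
import numpy as np


def intersection_local_time_pairs(path, eps):
    """Approximate L_G(1, 0) by the pairwise form in (3.8):
    (1 / (2 * eps)) times the fraction of grid pairs with |G(x) - G(y)| < eps."""
    g = np.asarray(path, dtype=float)
    close = np.abs(g[:, None] - g[None, :]) < eps
    return float(close.mean() / (2.0 * eps))


def intersection_local_time_occupation(path, width):
    """Approximate L_G(1, 0) by the last line of (3.9): the integral of the
    squared local time, using a histogram occupation density whose bin
    width is close to `width`."""
    g = np.asarray(path, dtype=float)
    span = g.max() - g.min()
    nbins = max(1, int(np.ceil(span / width)))
    counts, edges = np.histogram(g, bins=nbins)
    bin_width = edges[1] - edges[0]
    density = counts / (g.size * bin_width)  # estimate of the local time at time 1
    return float(np.sum(density ** 2) * bin_width)


# Illustration on a simulated Brownian path (seed and grid size are arbitrary).
rng = np.random.default_rng(0)
n = 2000
brownian = np.cumsum(rng.standard_normal(n)) / np.sqrt(n)
print(intersection_local_time_pairs(brownian, eps=0.05))
print(intersection_local_time_occupation(brownian, width=0.1))
```

A useful sanity check is the deterministic path $G(t) = t$, for which the local time equals one on $[0, 1]$ and both expressions should be close to one, up to a boundary effect of order `eps` in the pairwise version.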

It is interesting to note that $S_{1n}$ is a martingale sequence with conditional
variance $\sum_{t=2}^{n} Y_{nt}^2$, suggesting that some version of the martingale central
limit theorem [e.g., Hall and Heyde (1980), Chapter 3] may be applicable.
However, the problem is complicated by the U-statistic structure and the
weak convergence of the conditional variance, and use of existing limit theory
seems difficult. To investigate the asymptotics of $S_{1n}$, we therefore develop our
own approach. As part of this development, in Section 5, we provide a general
weak convergence theory to intersection local time, which is of independent
interest and useful in other applications. The conditions required for this
development are weaker than those in establishing Theorem 3.3 and that
section may be read separately.

We finally remark that the restrictive condition on the bandwidth $h$ in
Theorems 3.1-3.3 (i.e., $nh^4 \log^2 n \to 0$) is mainly used to offset the impact
of the error terms in (3.5) and (3.6). It seems difficult to relax this condition
under the prevailing Assumption 2, which allows for endogeneity in the
regressor $x_t$. See, for instance, the proof of Proposition 6.4 given in the
supplementary material [Wang and Phillips (2012)]. The restriction $nh^4 \log^2 n \to 0$
on $h$ in Theorems 3.1-3.3, however, can be reduced to the minimal
requirement $h \to 0$, if Assumption 2 is replaced by the following Assumption 2*.

Assumption 2*. For each $n \ge 1$, $\{u_t, F_{t,n}\}_{1 \le t \le n}$ forms a martingale
difference satisfying $\lim_{t \to \infty} \sup_{n \ge t} |E(u_{t+1}^2 \mid F_{t,n}) - \sigma^2| = 0$, a.s. and

$\sup_{n \ge t \ge 1} E(|u_{t+1}|^4 \mid F_{t,n}) < \infty,$

where

$F_{t,n} = \sigma(u_1, \ldots, u_t; x_1, \ldots, x_n), \quad t = 1, 2, \ldots, n;\ n \ge 1.$

Note that Assumption 2* holds true if $x_t$ is independent of $u_t$, and
$\{u_t, F_t\}_{t \ge 1}$ forms a martingale difference satisfying $E(u_{t+1}^2 \mid F_t) \to_{a.s.} \sigma^2 > 0$ as
$t \to \infty$ and $\sup_{t \ge 1} E(|u_{t+1}|^4 \mid F_t) < \infty$, where $F_t$ is a sequence of increasing
$\sigma$-fields. The independence assumption was used in Gao et al. (2009a) to
establish a similar version of Theorem 3.1.

4. Simulations. Simulations were conducted to evaluate the finite sample
performance of the statistic $Z_n = S_n/(\sqrt{2}\, V_n)$ under the null and some local
alternatives under various assumptions about the generating mechanism.
The results are summarized here, and more detailed findings are reported
in the supplementary material [Wang and Phillips (2012)]. The model
followed (2.1) with $y_{t+1} = f(x_t) + u_{t+1}$, $x_t = x_{t-1} + \eta_t$, $x_0 = 0$, and $\eta_t$ generated
by an AR(1) process $\eta_t = \lambda \eta_{t-1} + \epsilon_t$ or an MA(1) process $\eta_t = \epsilon_t + \lambda \epsilon_{t-1}$
with $(u_t, \epsilon_t) \sim$ i.i.d. $N\left(0, \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}\right)$. A linear null hypothesis $H_0: f(x) = \theta_0 + \theta_1 x$
was used together with polynomial local alternatives $H_1: f(x) = \theta_0 + \theta_1 x + \rho_n |x|^{\nu}$,
with $\rho_n = 1/(n^{1/4+\nu/3} h^{1/4})$. The parameter settings were $\theta_0 = 0$, $\theta_1 = 1$,
$\nu \in \{0.5, 1.5, 2, 3\}$ and $r \in \{0, \pm 0.5, \pm 0.75\}$. Results are reported for sample
sizes $n \in \{100, 200, 500\}$ and bandwidth settings $h = n^{-p}$ for $p \in \{1/4, 1/3, 1/2.5\}$.
Note that $h = n^{-1/4}$ satisfies the bandwidth requirement ($h \to 0$) that applies under
Assumption 2* but not the condition $nh^4 \log^2 n \to 0$ required under Assumption 2. The
number of replications was 5000.
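One replication of the AR(1) design can be generated along the following lines. This is a sketch rather than the code used for the reported tables: the function name, the start-up value $\eta_0 = 0$, and the timing convention (the error $u_{t+1}$ is drawn jointly with $\epsilon_{t+1}$, so it is correlated with the next regressor innovation but not with $x_t$) are assumptions made for illustration.

```python
import numpy as np


def simulate_design(n, r=0.5, lam=0.4, nu=2.0, h=None, theta=(0.0, 1.0), rng=None):
    """Draw (x, y_next, u_next) with y_{t+1} = theta0 + theta1*x_t + rho_n*|x_t|^nu + u_{t+1}.

    With h=None the data follow the linear null (rho_n = 0). Passing a bandwidth h
    switches on the local alternative with rho_n = 1 / (n^(1/4 + nu/3) * h^(1/4)).
    """
    rng = np.random.default_rng() if rng is None else rng
    cov = np.array([[1.0, r], [r, 1.0]])
    # Pairs (u_j, eps_j), j = 1, ..., n + 1, i.i.d. bivariate normal with correlation r.
    shocks = rng.multivariate_normal(np.zeros(2), cov, size=n + 1)
    eps = shocks[:n, 1]       # eps_1, ..., eps_n drive the regressor
    u_next = shocks[1:, 0]    # u_2, ..., u_{n+1} enter the regression
    # AR(1) innovations eta_t = lam * eta_{t-1} + eps_t, started at eta_0 = 0 (assumption).
    eta = np.empty(n)
    prev = 0.0
    for t in range(n):
        prev = lam * prev + eps[t]
        eta[t] = prev
    x = np.cumsum(eta)        # unit root regressor with x_0 = 0
    rho_n = 0.0 if h is None else 1.0 / (n ** (0.25 + nu / 3.0) * h ** 0.25)
    y_next = theta[0] + theta[1] * x + rho_n * np.abs(x) ** nu + u_next
    return x, y_next, u_next


# Null and local-alternative samples built from the same random draws.
x0, y0, u0 = simulate_design(200, rng=np.random.default_rng(0))
x1, y1, u1 = simulate_design(200, nu=2.0, h=200 ** (-1 / 3), rng=np.random.default_rng(0))
```

Fitting the linear null by least squares on `(x, y_next)` and passing the residuals to a routine for $Z_n$ would complete one Monte Carlo replication.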

Table 1 shows the actual size of the test for various $n$ and bandwidth
choices $h$ and for both exogenous ($r = 0$) and endogenous ($r = \pm 0.5$)
regressor cases with serially uncorrelated errors ($\lambda = 0$). Table 2 shows the
corresponding results for AR errors with $\lambda = \pm 0.4$. Size results for MA errors
are similar and are given in the supplementary material [Wang and Phillips
(2012)]. Under i.i.d. errors the test is somewhat undersized for $n = 100, 200$
but is close to the nominal size for $n = 500$ and for all bandwidth choices. There
is some mild oversizing under serially dependent $\eta_t$ when $\lambda = -0.4$ for
bandwidth $h = n^{-1/4}$, but size seems satisfactory for $\lambda = 0.4$ and for the smaller
Table 1
Size: $\eta_t = \epsilon_t$

              Nominal size 5%                        Nominal size 1%
  n    h = n^{-1/4}  n^{-1/3}  n^{-1/2.5}    h = n^{-1/4}  n^{-1/3}  n^{-1/2.5}

                                  r = 0
 100      0.028       0.035      0.033          0.006       0.006      0.007
 200      0.034       0.042      0.041          0.007       0.007      0.008
 500      0.044       0.045      0.050          0.009       0.010      0.010

                                  r = 0.5
 100      0.030       0.035      0.040          0.006       0.007      0.007
 200      0.038       0.044      0.045          0.009       0.008      0.008
 500      0.041       0.045      0.048          0.008       0.009      0.009

                                  r = -0.5
 100      0.031       0.035      0.037          0.007       0.008      0.008
 200      0.036       0.045      0.046          0.007       0.008      0.009
 500      0.041       0.047      0.051          0.009       0.010      0.011

bandwidths $h = n^{-1/3}, n^{-1/2.5}$. Since negative $\lambda$ reduces the long run
moving average coefficient $\phi$ [$\phi = 1/(1 - \lambda)$ for AR $\eta_t$] these results suggest that
the strength of the long run signal in $x_t$ (measured by the long-run variance
of $\eta_t$) affects the performance of the test. On the other hand, endogeneity at
the correlation level $r = \pm 0.5$ appears to have little effect on performance,
which mirrors results for estimation in the nonlinear nonstationary case
[Wang and Phillips (2009b)]. Higher levels of correlation ($r = \pm 0.75$)
produce some size distortion when there is serial dependence, but not when the
errors are independent; see Table 3.

Tables 4-6 show test power against the local alternative $H_1$ for polynomial
alternatives (cubic $\nu = 3$, quadratic $\nu = 2$ and three halves $\nu = 1.5$). Results
for the case $\nu = 0.5$ are given in the supplementary material [Wang and
Phillips (2012)]. Again, there is little difference between the exogenous and
endogenous cases, so only the endogenous case is reported here. As may
be expected, there is greater local discriminatory power for cubic ($\nu = 3$)
than quadratic ($\nu = 2$) or three halves ($\nu = 1.5$) alternatives. For $n = 100$
(500) power is greater than 69% (90%) for a nominal 1% test and greater
than 74% (92%) for a nominal 5% test when $\nu = 3$ under AR errors with
$\lambda = 0.4$ (Table 4). The corresponding results when $\nu = 2$ and $n = 100$ (500)
are 15% (38%) for a nominal 1% test and 23% (46%) for a nominal 5% test
(Table 5). Serial dependence affects power, which is higher for $\lambda = 0.4$ than
for $\lambda = -0.4$ in all cases. So lower long-run signal strength in the regressor
tends to reduce discriminatory power. For $\nu = 1.5$ and $\lambda = -0.4$, power is low
even for $n = 500$ (2±% for a 1% test and 7±% for a 5% test, Table 6). Low
Table 2
Size: $\eta_t = \lambda \eta_{t-1} + \epsilon_t$, $r = \pm 0.5$

              Nominal size 5%                        Nominal size 1%
  n    h = n^{-1/4}  n^{-1/3}  n^{-1/2.5}    h = n^{-1/4}  n^{-1/3}  n^{-1/2.5}

                            r = 0.5, λ = 0.4
 100      0.034       0.038      0.041          0.002       0.004      0.005
 200      0.044       0.044      0.047          0.004       0.006      0.007
 500      0.058       0.058      0.057          0.007       0.010      0.011

                            r = 0.5, λ = -0.4
 100      0.038       0.042      0.046          0.013       0.013      0.011
 200      0.051       0.051      0.051          0.018       0.015      0.014
 500      0.070       0.061      0.057          0.026       0.022      0.016

                            r = -0.5, λ = 0.4
 100      0.034       0.038      0.040          0.002       0.004      0.005
 200      0.044       0.044      0.048          0.004       0.006      0.007
 500      0.058       0.058      0.057          0.007       0.009      0.011

                            r = -0.5, λ = -0.4
 100      0.035       0.040      0.043          0.012       0.012      0.012
 200      0.050       0.049      0.050          0.018       0.015      0.013
 500      0.073       0.064      0.056          0.026       0.018      0.016

power also occurs against the local alternative with $\nu = 0.5$ [see Wang and
Phillips (2012)], which also reduces signal strength in the regressor function.
Thus, discriminatory power is dependent on the specific alternative and, as
asymptotic theory suggests, is sensitive to the magnitude rate ($\nu$) of $m(x)$
as $|x| \to \infty$.

Overall, the finite sample results reflect the asymptotic theory and seem 
reasonable for practical use in testing when there is some endogeneity in non- 
parametric nonstationary regression, especially if smaller bandwidth choices 
than usual are employed. In cases of serial dependence when the long-run
signal strength in the regressor $x_t$ is reduced, finite sample adjustments for
the test critical values may be useful in correcting size, as has been found 
for i.i.d. and stationary regressors [Li and Wang (1998)]. 

In practice, the exact $\alpha$-level critical value $\ell_{\alpha}(h)$ ($0 < \alpha < 1$) of the finite
sample distribution of $S_n/(\sqrt{2}\, V_n)$ depends on all the unknown parameters
and functions in the model. The development of a rigorous theory of
approximation for $\ell_{\alpha}(h)$ and the choice of an optimal bandwidth for use in
testing are challenging problems in the nonstationary setting. Gao et al.
(2009a) provided an approximate value of $\ell_{\alpha}(h)$ by using the bootstrap and
considered numerical solutions for a bandwidth $h$ that optimizes the power
function, both under the assumption that $x_t$ and $u_t$ are independent. It is
Table 3
Size: $\eta_t = \lambda \eta_{t-1} + \epsilon_t$, $r = \pm 0.75$

              Nominal size 5%                        Nominal size 1%
  n    h = n^{-1/4}  n^{-1/3}  n^{-1/2.5}    h = n^{-1/4}  n^{-1/3}  n^{-1/2.5}

                            r = 0.75, λ = 0.4
 100      0.036       0.038      0.039          0.003       0.003      0.004
 200       [?]         [?]        [?]            [?]         [?]        [?]
 500       [?]         [?]        [?]            [?]         [?]        [?]

                            r = 0.75, λ = -0.4
 100      0.074       0.068      0.027          0.036       0.033      0.027
 200       [?]         [?]        [?]            [?]         [?]        [?]
 500       [?]         [?]        [?]            [?]         [?]        [?]

                            r = 0.75, λ = 0
 100      0.026       0.029      0.032          0.005       0.006      0.006
 200      0.037       0.044      0.046          0.007       0.008      0.010
 500      0.040       0.042      0.047          0.008       0.009      0.009

                            r = -0.75, λ = 0
 100      0.027       0.035      0.036          0.005       0.008      0.007
 200      0.036       0.040      0.043          0.008       0.010      0.010
 500      0.041       0.045      0.044          0.008       0.008      0.009

                            r = -0.75, λ = 0.4
 100      0.074       0.071      0.063          0.003       0.004      0.004
 200      0.103       0.085      0.074          0.011       0.012      0.011
 500      0.135       0.105      0.088          0.027       0.020      0.015

                            r = -0.75, λ = -0.4
 100      0.070       0.066      0.065          0.033       0.026      0.023
 200      0.109       0.094      0.087          0.055       0.042      0.033
 500      0.175       0.136      0.109          0.093       0.065      0.048

[?] marks an entry that is illegible in the source.

not clear at the moment whether similar techniques can be rigorously justi- 
fied in the current general model and there is presently no optimal approach 
to bandwidth selection. The investigation of such finite sample adjustments 
and selection criteria is therefore left for later research. Earlier analysis of 
the restrictions on the bandwidth in Theorems 3.1-3.3, in conjunction with 
the simulation evidence, indicates that smaller bandwidths than usual for 
stationary regression are likely to be more reliable in practical work for 
specification testing of nonlinear nonstationary regression. 

5. Convergence to intersection local time. Consider a linear process $\{\eta_j,
j \ge 1\}$ defined by $\eta_j = \sum_{k=0}^{\infty} \phi_k \epsilon_{j-k}$, where $\{\epsilon_j, j \in Z\}$ is a sequence of i.i.d.
random variables with $E\epsilon_0 = 0$ and $E\epsilon_0^2 = 1$, and the coefficients $\phi_k$, $k \ge 0$,
Table 4
Local power: $\nu = 3$, $\eta_t = \lambda \eta_{t-1} + \epsilon_t$, $r = \pm 0.5$

              Nominal size 5%                        Nominal size 1%
  n    h = n^{-1/4}  n^{-1/3}  n^{-1/2.5}    h = n^{-1/4}  n^{-1/3}  n^{-1/2.5}

                            r = 0.5, λ = 0.4
 100      0.819       0.779      0.743          0.787       0.739      0.693
 200      0.906       0.878      0.845          0.892       0.849      0.811
 500      0.971       0.950      0.923          0.963       0.935      0.901

                            r = 0.5, λ = -0.4
 100      0.247       0.211      0.179          0.197       0.154      0.126
 200      0.358       0.306      0.265          0.302       0.247      0.199
 500      0.522       0.448      0.389          0.458       0.376      0.310

                            r = -0.5, λ = 0.4
 100      0.829       0.780      0.743          0.792       0.742      0.696
 200      0.910       0.879      0.845          0.891       0.851      0.813
 500      0.965       0.947      0.921          0.957       0.931      0.903

                            r = -0.5, λ = -0.4
 100      0.238       0.204      0.176          0.189       0.151      0.127
 200      0.352       0.297      0.253          0.295       0.239      0.193
 500      0.513       0.431      0.367          0.449       0.367      0.301

are assumed to satisfy YlfcLol^kl < 00 an d 4> = J2T=o ^ 0- Let 

(5.1) $y_{k,n} = \rho\, y_{k-1,n} + \eta_k$, $y_{0,n} = 0$, $\rho = 1 + \kappa/n$,

where $\kappa$ is a constant. The array $y_{k,n}$, $k \ge 0$, is known as a nearly
unstable process or, in the econometric literature, as a near-integrated time
series. Write $x_{k,n} = y_{k,n}/(\sqrt{n}\,\phi)$. The classical invariance
principle gives

(5.2) $x_{[nt],n} \Rightarrow G(t) := \int_0^t e^{\kappa(t-s)}\,dW(s) = W(t) + \kappa \int_0^t e^{\kappa(t-s)} W(s)\,ds$

on D[0, 1], where W(t) is a standard Brownian motion [e.g., Phillips (1987),
Buchmann and Chan (2007), Wang and Phillips (2009b)]. Furthermore,
$\{\varepsilon_j, j \in \mathbb{Z}\}$ can be redefined on a richer probability space
which also contains a standard Brownian motion $W_1(t)$ such that

(5.3) $\sup_{0 \le t \le 1} |x_{[nt],n} - G_1(t)| = o_P(1)$,



where $G_1(t) = W_1(t) + \kappa \int_0^t e^{\kappa(t-s)} W_1(s)\,ds$. Indeed, by
noting on the richer space that

(5.4) $\sup_{0 \le t \le 1} \Bigl| \frac{1}{\sqrt{n}} \sum_{j=1}^{[nt]} \varepsilon_j - W_1(t) \Bigr| = o_P(1)$

[see, e.g., Csörgő and Révész (1981)], and using this result in place of the
fact that $\frac{1}{\sqrt{n}} \sum_{j=1}^{[nt]} \varepsilon_j \Rightarrow W(t)$ on
D[0, 1], the same technique as in the proof of Phillips (1987) [see also Chan
and Wei (1987)] yields

$\sup_{0 \le t \le 1} \Bigl| \frac{1}{\sqrt{n}} \sum_{j=1}^{[nt]} \rho^{[nt]-j} \varepsilon_j - G_1(t) \Bigr| = o_P(1)$.

The result (5.3) can now be obtained by the same argument, with minor
modifications, as in the proof of Proposition 7.1 in Wang and Phillips
(2009b).

Table 5
Local power: $\nu = 2$, $\eta_t = \lambda\eta_{t-1} + \varepsilon_t$, $r = \pm 0.5$

                Nominal size 5%                        Nominal size 1%
   n   h = n^{-1/4}   n^{-1/3}   n^{-1/2.5}     n^{-1/4}   n^{-1/3}   n^{-1/2.5}

$r = 0.5$, $\lambda = 0.4$
  100     0.357        0.282      0.228          0.282      0.205      0.147
  200     0.484        0.389      0.315          0.418      0.310      0.228
  500     0.682        0.557      0.458          0.616      0.482      0.376

$r = 0.5$, $\lambda = -0.4$
  100     0.058        0.054      0.053          0.027      0.020      0.016
  200     0.103        0.083      0.068          0.048      0.034      0.024
  500     0.169        0.118      0.094          0.098      0.057      0.036

$r = -0.5$, $\lambda = 0.4$
  100     0.114        0.123      0.128          0.065      0.066      0.067
  200     0.226        0.235      0.244          0.157      0.159      0.160
  500     0.437        0.457      0.462          0.350      0.359      0.367

$r = -0.5$, $\lambda = -0.4$
  100     0.056        0.050      0.046          0.022      0.016      0.014
  200     0.102        0.082      0.066          0.053      0.031      0.022
  500     0.173        0.123      0.096          0.103      0.061      0.037
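
Returning to the limit process in (5.2): its two expressions, the stochastic integral $\int_0^t e^{\kappa(t-s)}\,dW(s)$ and $W(t) + \kappa\int_0^t e^{\kappa(t-s)}W(s)\,ds$, agree by integration by parts, since the integrand is deterministic and smooth in s. The short check below is added for the reader and uses only those two expressions.

```latex
\begin{align*}
\int_0^t e^{\kappa(t-s)}\,dW(s)
  &= \Bigl[ e^{\kappa(t-s)} W(s) \Bigr]_{s=0}^{s=t}
     - \int_0^t W(s)\,\frac{\partial}{\partial s}\, e^{\kappa(t-s)}\,ds \\
  &= W(t) + \kappa \int_0^t e^{\kappa(t-s)} W(s)\,ds ,
\end{align*}
% using W(0) = 0 and (d/ds) e^{\kappa(t-s)} = -\kappa e^{\kappa(t-s)}.
```

In particular, when $\kappa = 0$ the process G reduces to the Brownian motion W itself.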

Table 6
Local power: $\nu = 1.5$, $\eta_t = \lambda\eta_{t-1} + \varepsilon_t$, $r = \pm 0.5$

                Nominal size 5%                        Nominal size 1%
   n   h = n^{-1/4}   n^{-1/3}   n^{-1/2.5}     n^{-1/4}   n^{-1/3}   n^{-1/2.5}

$r = 0.5$, $\lambda = 0.4$
  100     0.058        0.051      0.045          0.021      0.012      0.010
  200     0.087        0.065      0.057          0.040      0.022      0.015
  500     0.158        0.103      0.077          0.096      0.046      0.024

$r = 0.5$, $\lambda = -0.4$
  100     0.043        0.040      0.041          0.016      0.014      0.012
  200     0.061        0.058      0.055          0.024      0.019      0.015
  500     0.096        0.074      0.070          0.038      0.031      0.023

$r = -0.5$, $\lambda = 0.4$
  100     0.066        0.053      0.050          0.025      0.015      0.011
  200     0.093        0.065      0.052          0.046      0.023      0.015
  500     0.152        0.094      0.090          0.088      0.042      0.023

$r = -0.5$, $\lambda = -0.4$
  100     0.049        0.049      0.049          0.018      0.017      0.013
  200     0.063        0.058      0.059          0.024      0.021      0.017
  500     0.092        0.074      0.064          0.037      0.029      0.021

The aim of this section is to investigate the asymptotic behavior of a
functional $S_{[nr]}$ of the $x_{k,n}$, defined by

(5.5) $S_{[nr]} = \frac{c_n}{n^2} \sum_{k,j=1}^{[nr]} g[c_n(x_{k,n} - x_{j,n})]$,

where g is a real function on R, and $c_n$ is a certain sequence of positive
constants. Under certain conditions on g(x), $\varepsilon_0$ and $c_n$, it is
established that, for each fixed $0 < r \le 1$, $S_{[nr]}$ converges to an
intersection local time process of G(t). Explicitly, we have the following
main result.
Theorem 5.1. Suppose that $\int_{-\infty}^{\infty} |g(x)|\,dx < \infty$,
$\omega \equiv \int_{-\infty}^{\infty} g(x)\,dx \ne 0$ and
$\int_{-\infty}^{\infty} |E e^{it\varepsilon_0}|\,dt < \infty$. Then, for any
$c_n \to \infty$, $n/c_n \to \infty$ and fixed $r \in (0,1]$,

(5.6) $S_{[nr]} \to_D \omega L_G(r,0)$,

where $L_G(t,u)$ is the intersection local time of G(t) defined in (3.8).
Furthermore, under the same probability space for which (5.3) holds, we have
that, for any $c_n \to \infty$ and $n/c_n \to \infty$,

(5.7) $\sup_{0 \le r \le 1} |S_{[nr]} - \omega L_{G_1}(r,0)| \to_P 0$.
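
A minimal numerical sketch of the objects in (5.1) and (5.5) may help fix ideas. It assumes Gaussian innovations, a finite-order filter for the linear process, the standard normal density for g (so that its integral equals 1), and $c_n = n^{1/4}$, which satisfies $c_n \to \infty$ and $n/c_n \to \infty$. The helper names `simulate_x` and `S_stat` are illustrative and are not taken from the paper.

```python
import numpy as np


def simulate_x(n, kappa=0.0, phi_coefs=(1.0,), rng=None):
    """Simulate x_{k,n} = y_{k,n} / (sqrt(n) * phi), k = 1..n, from (5.1).

    eta_j is a finite-order linear process in i.i.d. N(0, 1) innovations
    (an illustrative assumption; the text allows an infinite-order filter).
    """
    rng = np.random.default_rng() if rng is None else rng
    coefs = np.asarray(phi_coefs, dtype=float)
    phi = coefs.sum()                                # phi = sum_k phi_k, must be nonzero
    eps = rng.standard_normal(n + len(coefs) - 1)
    eta = np.convolve(eps, coefs, mode="valid")      # eta_j = sum_k phi_k eps_{j-k}
    rho = 1.0 + kappa / n
    y = np.empty(n)
    prev = 0.0                                       # y_{0,n} = 0
    for k in range(n):
        prev = rho * prev + eta[k]
        y[k] = prev
    return y / (np.sqrt(n) * phi)


def S_stat(x, r, c_n, g):
    """(c_n / n^2) * sum_{k,j=1}^{[nr]} g(c_n * (x_k - x_j)), as in (5.5)."""
    n = len(x)
    m = int(np.floor(n * r))
    diffs = x[:m, None] - x[None, :m]                # all pairwise differences
    return c_n / n**2 * g(c_n * diffs).sum()


if __name__ == "__main__":
    n = 400
    x = simulate_x(n, kappa=-1.0, phi_coefs=(1.0, 0.5), rng=np.random.default_rng(0))
    normal_pdf = lambda z: np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    for r in (0.25, 0.5, 1.0):
        print(r, S_stat(x, r, n**0.25, normal_pdf))
```

Because g is nonnegative here, the computed path is nondecreasing in r, which matches the monotonicity one expects of an intersection local time in its time argument. Setting `c_n = 1.0` gives the unnormalized double sum treated separately in the text.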

The integrability condition on the characteristic function of $\varepsilon_0$
can be weakened if we place further restrictions on g(x). Indeed, we have the
following theorem.

Theorem 5.2. Theorem 5.1 still holds if
$\int_{-\infty}^{\infty} |E e^{it\varepsilon_0}|\,dt < \infty$ is replaced by the
Cramér condition, that is, $\limsup_{|t| \to \infty} |E e^{it\varepsilon_0}| < 1$,
and, in addition to the stated conditions already on g(x), we have
$|g(x)| \le M/(1 + |x|^{1+b})$ for some b > 0, where M is a constant.






It is interesting to notice that the additional condition on g(x) in
Theorem 5.2 cannot be reduced without further restriction on $\varepsilon_0$
like that in Theorem 5.1. This claim can be explained as in Example 4.2.2 of
Borodin and Ibragimov (1994) with some minor modifications. On the other hand,
the asymptotic behavior of $S_{[nr]}$ when $c_n = 1$ is quite different, as seen
in the following theorem.

Theorem 5.3. Suppose that g(x) is a Borel measurable function satisfying

(5.8) $\lim_{h \to 0} \int_{-K}^{K} \sup_{|u| \le h} |g(x+u) - g(x)|\,dx = 0$

for all K > 0 and some 0 < a < 1. Then, under the same probability space
for which (5.3) holds, we have

(5.9) $\sup_{0 \le r \le 1} \Bigl| \frac{1}{n^2} \sum_{k,j=1}^{[nr]} g(x_{k,n} - x_{j,n}) - \int_0^r \int_0^r g[G_1(u) - G_1(v)]\,du\,dv \Bigr| = o_P(1)$.

We mention that condition (5.8) is quite weak. Indeed, Example 2.8 and
the discussion following Theorem 2.3 in Berkes and Horváth (2006) show
that (5.8) cannot be replaced by

$\lim \int_{-K}^{K} |x|^{a-1} |g(x+u) - g(x)|\,dx = 0$

for all K > 0 and some 0 < a < 1.

Local time has figured in much recent work on parametric and nonparametric
estimation with nonstationary data. Motivated by nonlinear regression with
integrated time series [Park and Phillips (1999, 2001)] and nonparametric
estimation of nonlinear cointegration models, many authors [Phillips and Park
(1998), Karlsen and Tjøstheim (2001), Karlsen, Myklebust and Tjøstheim (2007),
Wang and Phillips (2009a)] have used or proved weak convergence to the local
time of a stochastic process, including results of the following type: under
certain conditions on the function g, the limiting stochastic process G(t), a
sequence $c_n \to \infty$, and normalized data $x_{k,n}$,

(5.10) $\frac{c_n}{n} \sum_{k=1}^{n} g(c_n x_{k,n}) \to_D \omega\,\mathcal{L}_G(1,0)$,

where $\mathcal{L}_G(t,s)$ is the local time of the process G(t) at the spatial
point s. We refer to Borodin and Ibragimov (1994) (and their references for
related work) for the particular situation where $c_n x_{k,n}$ is a partial sum
of i.i.d. random variables, and to Akonom (1993), Phillips and Park (1998),
Jeganathan (2004) and de Jong and Wang (2005) for the case where $c_n x_{k,n}$
is a partial sum of a linear process. Wang and Phillips [(2009a), Theorem 2.1]
generalized these results to include not only linear process partial sums but
also cases where $c_n x_{k,n}$ is a partial sum of a Gaussian process, including
fractionally integrated time series.

Our present research on the statistic $S_{[nr]}$ in (5.5) has a similar
motivation to this earlier work on convergence to a local time process.
However, the statistic $S_{[nr]}$ has a much more complex U-statistic form, and
the technical difficulties of establishing weak convergence are greater. The
approach of Wang and Phillips [(2009a), Theorem 2.1] remains useful, however,
and is implemented in the proofs of Theorems 3.1-3.3.

Finally, we mention some earlier work investigating the intersection local
time process and weak convergence for certain specialized situations. This
work restricts the function g in (5.5) to the indicator function and the
discrete process $y_{k,n}$ in (5.1) to a lattice random walk taking integer
values; see, for instance, Aldous (1986), van der Hofstad, den Hollander and
König (1997), van der Hofstad and König (2001) and van der Hofstad, den
Hollander and König (2003). The present paper seems to be the first to
consider weak convergence to intersection local time for a general linear
process and a general function g.

The proofs of Theorems 5.1-5.3 are given in the supplementary material 
[Wang and Phillips (2012)]. 

6. Proofs of Theorems 3.1-3.3. We start with several propositions. Their
proofs are given in the supplementary material [Wang and Phillips (2012)].
Throughout the section, we let $C, C_1, C_2, \ldots$ be constants which may
differ at each appearance.

Proposition 6.1. Suppose Assumptions 1 and 2 hold. For any
$\alpha_1, \alpha_2 \ge 0$, if $\sup_x |p(x)| < \infty$,
$\int (1 + |x|^{\max\{\alpha_1,\alpha_2\}+1}) |p(x)|\,dx < \infty$ and
$E|\varepsilon_0|^{[\alpha_1]+[\alpha_2]+2} < \infty$, then

(6.1) $A_n := \sum_{s,t=1}^{n} g(u_{s+1})\, g_1(u_{t+1}) (1 + |x_s|^{\alpha_1})(1 + |x_t|^{\alpha_2})\, p[(x_t - x_s)/h] = O_P(n^{3/2+\alpha_1/2+\alpha_2/2} h)$,

where g(x) and $g_1(x)$ are real functions such that

$\sup_{s \ge 1} E\{[g^2(u_{s+1}) + g_1^2(u_{s+1})] \mid \mathcal{F}_s\} < \infty$.

If additionally $\alpha_1 > 0$, then

(6.2) $A_n := \sum_{1 \le s < t \le n} g(u_{s+1}) (1 + |x_s|^{\alpha_1 - 1})\, p[(x_t - x_s)/h] = O_P(n^{\max\{3/2,\, 1+\alpha_1/2\}} h)$.

Proposition 6.2. Suppose Assumptions 1-3 hold. Then, for any $g(x,\theta)$
satisfying (2.5) and $|g(x,\theta)| \le C(1 + |x|^{\beta})$, where
$\theta \in \Theta_0$, we have

(6.3) $A_n := \sum_{s,t=1}^{n} u_{s+1}\, g(x_t,\theta)\, K[(x_t - x_s)/h] = O_P(n^{5/4+\beta/2} h^{3/4})$,

provided that $nh^2 \to \infty$, $nh^4 \to 0$,
$\int (1 + |x|^{\beta+1}) K(x)\,dx < \infty$ and
$E|\varepsilon_0|^{\beta+2} < \infty$. Similarly, (6.3) holds true if we replace
$g(x,\theta)$ and $\beta$ by m(x) and $\nu$, respectively, where m(x) is defined
as in Assumption 6.

Proposition 6.3. Suppose Assumptions 1-3 hold and $nh^2 \to \infty$. Then,
for any real function g(x) satisfying
$\sup_{s \ge 1} E\{g^2(u_{s+1}) \mid \mathcal{F}_s\} < \infty$, we have

(6.4) $\Gamma_n := \sum_{s,t=1}^{n} g(u_{s+1})(u_{t+1}^2 - \sigma^2) K^2[(x_t - x_s)/h] = o_P(n^{3/2} h)$.

Proposition 6.4. In addition to Assumptions 1-3, we have $|u_j| \le A$
and $nh^2 \to \infty$. Then,

(6.5) $R_n := \sum_{t=1}^{n} \sum_{i,j=1,\, i \ne j}^{t-1} u_{i+1} u_{j+1} K[(x_t - x_i)/h]\, K[(x_t - x_j)/h] = o_P(n^{3/2} h)$.

Proposition 6.5. Under Assumptions 1-3 and hlog 2 n— >0, we have 

(6.6) EZf kr <C max E[\ Ui \(l + |uj|)](l + hy/t-r-k) 

l<i,jr'<n 

for 1 < k < t — r and r > 1, where Z tkr = Yli=k u i+l^[( x t —Xi)/h]. Similarly, 

(6.7) E {l>*+1 " E ( U W I F 3)\K 2 [{x t - Xi )/h]\ < C7(l + hVt). 

If in addition \uj\ < A, where A is a constant, then 

(6.8) EZ? 12 < ChH 3 / 2 , 
and for any 1 < m < t/2, 

. . Ch 2 t 2 Ch 2 t\og(t-m) Ch 2 t 
6.9 EZ£<—^ + yL >- + , 



where Z? m = Ya=T 1 u i+1 E(K[(x t - Xi)/h] \ T x 



t—m ) 
6.1. Proof of Theorem 3.1. By virtue of (3.5) and Theorem 3.3, it suffices
to verify (3.6) and show that

(6.10) $S_{2n} = o_P(n^{3/4}\sqrt{h})$ and $S_{3n} = o_P(n^{3/4}\sqrt{h})$.

To this end, for $\delta > 0$, let
$\Omega_n = \{\hat\theta: \|\hat\theta - \theta\| \le \delta\delta_n,\ \theta \in \Theta_0\}$,
where $\delta_n$ is given in Assumption 4(i).

We first prove (6.10). Note that $\Omega_n \subset \Theta_0$ for all n
sufficiently large. Under Assumption 4, it follows by Taylor's expansion that,
whenever n is sufficiently large and $\hat\theta \in \Omega_n$,



.11) 



where 



S- 



2n 



-A dfixu&) 

u i+l TiQ K i\ x t ~ Xi)/h\ + S 2 nl 



i,t=l 



06 



s 2 m < c\e - e\ 2 ]T \ Ui+1 \(i + \x t f)K[( Xt - Xi )/h}. 

i,t=l 

By Proposition 6.2 with $g(x,\theta) = \partial f(x,\theta)/\partial\theta$ and
$\delta_n^2 n^{1+\beta}\sqrt{h} \to 0$, the first term in the decomposition of
$S_{2n}$ is equal to

$O_P(\delta_n n^{5/4+\beta/2} h^{3/4}) = o_P(n^{3/4}\sqrt{h})$.

On the other hand, by Proposition 6.1 and $nh^2 \to \infty$, we get

$S_{2n1} = O_P(\delta_n^2 n^{3/2+\beta/2} h) = o_P(n^{3/4}\sqrt{h})$.
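
The two rate comparisons just used reduce to exponent arithmetic under the stated condition $\delta_n^2 n^{1+\beta}\sqrt{h} \to 0$. The worked version below is a reader's check rather than part of the original argument; it reads the two bounds as $O_P(\delta_n n^{5/4+\beta/2} h^{3/4})$ and $O_P(\delta_n^2 n^{3/2+\beta/2} h)$ and divides each by the target rate $n^{3/4}\sqrt{h}$.

```latex
\begin{align*}
\frac{\delta_n\, n^{5/4+\beta/2}\, h^{3/4}}{n^{3/4}\, h^{1/2}}
  &= \delta_n\, n^{1/2+\beta/2}\, h^{1/4}
   = \bigl( \delta_n^2\, n^{1+\beta} \sqrt{h} \bigr)^{1/2} \to 0, \\
\frac{\delta_n^2\, n^{3/2+\beta/2}\, h}{n^{3/4}\, h^{1/2}}
  &= \delta_n^2\, n^{3/4+\beta/2}\, h^{1/2}
   = \bigl( \delta_n^2\, n^{1+\beta} \sqrt{h} \bigr)\, n^{-1/4-\beta/2} \to 0 .
\end{align*}
```

Both ratios vanish, which is exactly the pair of $o_P(n^{3/4}\sqrt{h})$ statements above.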

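These facts imply, for any $\delta > 0$,

(6.12) $P(|S_{2n}| \ge \delta n^{3/4}\sqrt{h}) \le P(|S_{2n}| \ge \delta n^{3/4}\sqrt{h},\ \hat\theta \in \Omega_n) + P(\|\hat\theta - \theta\| > \delta\delta_n) \to 0$

as $n \to \infty$.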

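Similarly, by using Proposition 6.1 and noting

(6.13) $|S_{3n}| \le C\|\hat\theta - \theta\|^2 \sum_{i,t=1}^{n} \Bigl|\frac{\partial f(x_i,\theta)}{\partial\theta}\Bigr| \Bigl|\frac{\partial f(x_t,\theta)}{\partial\theta}\Bigr| K[(x_t - x_i)/h]$
       $\le C\delta_n^2 \sum_{i,t=1}^{n} (1 + |x_i|^{\beta})(1 + |x_t|^{\beta}) K[(x_t - x_i)/h]$
       $= O_P(\delta_n^2 n^{3/2+\beta} h) = o_P(n^{3/4}\sqrt{h})$,

whenever $\hat\theta \in \Omega_n$, we obtain, for any $\delta > 0$,

(6.14) $P(|S_{3n}| \ge \delta n^{3/4}\sqrt{h}) \le P(|S_{3n}| \ge \delta n^{3/4}\sqrt{h},\ \hat\theta \in \Omega_n) + P(\|\hat\theta - \theta\| > \delta\delta_n) \to 0$

as $n \to \infty$.

Combining (6.12) and (6.14), we obtain (6.10).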
We next prove (3.6). We may write 

n 

V^=Y,^ + i^t + iK 2 [(xt-x s )/h} 

s,t=l 

n 

+ ^ (u 2 s+1 - u 2 s+l )u 2 t+l K 2 [{x t - x s )/h] 

s,t=l 
s^t 



(6.15) 



n 

+ u s+iOm - u t+i) K \ x t ~ x s )/h] 

s,t=l 



■■=v ln + v 2n + v 3n . 

Recall \f(x s ,9) — f(x s ,9)\ < CS n (l + \x s \@) whenever 9 G fl n and \u 2 +1 
u 2 t+l \ = 2\u t+ i\\f(x s ,9) - f(x s ,9)\ + \f{x s ,9) - f(x s ,9)\ 2 . It is readily seen 

from Proposition 6.1 that, given 9 G £l n , 

n 

\V 2n \ + \V 3n \ < C5 n K+ik+iU + \x s f)K[{x t - x s )/h] 

s,t=l 

II 

+ C5l u 2 +1 {l + \x s \ 2 ^)K[{x t - x s )/h] 

s,t=l 
n 

+ C8l Y + \x s f)(l + \x t \ w )K[{x t - x s )/h] 

s,t=l 
n 

+ CSt Y (1 + l^| 2/3 )(! + \xt? P )K[{x t - x s )/h] 

s,t=l 

= P {n^ 2 h){5 n n^ 2 + 5 2 y + S 3 n n^ 2 + 5^) 
= o P (nV 2 h), 
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since nh 2 — > oo and 5 2 n 1+/3 \/7i — > 0. As for V\ n , by Proposition 6.3, we have 

n n 

V ln = a 4 E ~ x ^)/h] + E ( n m + cj2 )(^+i " ° 2 )K\x t - x s )/h] 

s,t=l s,t=l 
n 

= a 4 Yl K ^ Xt ~ x »)l h \ + op(n 3/2 h). 

s,t=l 
s^t 

Taking these estimates into (6.15), we get the first part of (3.6). 

In order to prove the second part of (3.6), we first assume \uj\ < A. In 
this case, simple calculations together with Propositions 6.3 and 6.4 yield 
that 

n n t— 1 

Y, Y nt = EE^+i 1 ^ - *•)/*] 

t=2 t=2 s=l 

n t-1 

(6.16) + n i+1 n J -+ii ; r[(x t - sci)//i]if [(a; t - Xj)//i] 

i=i ij=i 

2 n 

= Y E K2 [( x * - + Mn 3/2 h) 

s,t=l 

as required. The idea to remove the restriction \uj \ < A is the same as in 
the proof of Theorem 3.3. We omit the details. The proof of Theorem 3.1 is 
now complete. 
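
For readers who want to compute the quantities handled in this proof, the following Python sketch evaluates a kernel-weighted double sum $S_n$ and its normalizer $V_n^2$ in the forms that appear in (6.15), (6.17) and (6.27), with the diagonal terms s = t excluded. The Gaussian kernel, the function names and the index convention (entry t of `resid` is the residual paired with $x_t$) are assumptions made for illustration; the formal definitions and any further scaling of the test statistic are those of Section 3.

```python
import numpy as np


def gaussian_kernel(z):
    """Standard normal density, used here as an illustrative kernel K."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)


def spec_test_parts(x, resid, h, kernel=gaussian_kernel):
    """Return (S_n, V_n^2) from regressors x_t and residuals u_hat_{t+1}.

    S_n   = sum_{s != t} u_hat_{t+1} u_hat_{s+1} K[(x_t - x_s)/h]
    V_n^2 = sum_{s != t} u_hat_{t+1}^2 u_hat_{s+1}^2 K^2[(x_t - x_s)/h]
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(resid, dtype=float)
    K = kernel((x[:, None] - x[None, :]) / h)   # K[(x_t - x_s)/h] for all pairs
    np.fill_diagonal(K, 0.0)                    # drop the s == t terms
    S_n = u @ K @ u
    V2_n = (u**2) @ (K**2) @ (u**2)
    return S_n, V2_n


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 200
    x = np.cumsum(rng.standard_normal(n))       # unit-root regressor
    y_next = 0.5 * x + rng.standard_normal(n)   # linear null model, theta = 0.5
    theta_hat = (x @ y_next) / (x @ x)          # least squares fit
    S_n, V2_n = spec_test_parts(x, y_next - theta_hat * x, h=n ** (-1 / 3))
    print(S_n / np.sqrt(V2_n))
```

The ratio $S_n/V_n$ printed at the end is the quantity bounded from below in the proof of Theorem 3.2; the vectorized form costs O(n^2) memory, which is unproblematic at the sample sizes used in the simulations.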

6.2. Proof of Theorem 3.2. Put
$u^*_{t+1} = u_{t+1} + f(x_t,\theta) - f(x_t,\hat\theta)$. Under $H_1$, we may
write

(6.17) $S_n = S_{1n} + 2S_{2n} + S_{3n} + S_{4n} + S_{5n}$,

where $S_{1n}$, $S_{2n}$, $S_{3n}$ are defined as in (3.5), and

$S_{4n} = 2\rho_n \sum_{i,t=1}^{n} m(x_i)\, u^*_{t+1} K[(x_t - x_i)/h]$,

$S_{5n} = \rho_n^2 \sum_{i,t=1}^{n} m(x_i)\, m(x_t) K[(x_t - x_i)/h]$.

Thus (3.4) will follow if we prove

(6.18) $S_{jn} = O_P(n^{3/4} h^{1/2})$, j = 1, 2, 3,

(6.19) $S_{4n} = O_P(\rho_n n^{5/4+\nu/2} h^{3/4})$,

(6.20) $V_n^2 = O_P(n^{3/2} h + \rho_n^4 n^{3/2+2\nu} h)$ under $H_1$,

and for any $\varepsilon_n \to 0$,

(6.21) $S_{5n} \ge \varepsilon_n \rho_n^2 n^{3/2+\nu} h$ in Probab.

Here and below, the notation A n > B n , in Probab. means that \\m. n ^ (yo P{A n > 
B n ) = 1, as n — > oo. Indeed, by choosing e~ 2 = mm{p 2 l n 1 ^ 2+u n 3 ^ 2 Vh}, 
it is readily seen that e n — > 0, \Sj n \ = Op{e n S^ n ) = op{S^ n ) for j = 1,2,3,4 
and S^n/Vn > e~ , in Probab. Hence S n /V n > e^ 1 /2, in Probab., which 
yields (3.4). 

We next prove (6.19)-(6.21). The proof of (6.18) for j = 2,3 is given 
in (6.10), and the result for j = 1 is simple by martingale properties and 
Proposition 6.5. 

We prove (6.21) first. We may write

(6.22) Sz, n = S^ni + Ssn2, 

where S 5nl = 2/%Y,i<i<t<n m2 ( x i) K [( x t ~ x i)/h] and 

|5 , 5 „2|<2/)^ ^ \m{xi)\\m(x t ) -m{xi)\K[{x t - Xi)/h}. 

l<i<t<n 

Let v' = v if v > and u f = 7' if u = 0. It follows from (3.3) and Proposi- 
tion 6.1 that 

\S 5n2 \<Ch~<pl ^ + \xi\ u ){l + \xi\ u '- 1 + \x t -x i \ u )K 1 [(x t -x i )/h] 

l<i<t<n 

<ChT'pl Y, {(l + l^r'-' + lx^'-^K^xt-xO/h} 

l<i<t<n 

(6.23) + h"(l + \ Xi \ u )K v+1 [{x t - Xi )/h}} 

= Op(/ i 1+ >2[ n ma x {3/2,l+(,+,')/2} +n 3/2+,/ 2]) 

= o P (h^p 2 y/ 2 +n, 

where K u (x) = \x\ u K(x), u > and we have used the fact that supJ-?T u (x)| < 
00 whenever J K u {x) dx < 00 [recall sup x |i^(^)| < 00]. Since h — > and < 
7 < 1, to prove (6.21), it only needs to show that, for any K 1 ! 2 < e n — > 0, 

(6.24) S 5nl > e n p 2 n n 3/2+u in Probab. 
In fact, by (5.3) and letting $x_{[ns],n} = x_{[ns]}/(\sqrt{n}\phi)$,

inf \xj\>y/n</)[ inf \Gi(s)\ - sup \x, ns]n - G^s) 

n/2<j<n Vl/2< S <1 1/2<s<1 

(6.25) 

> ej/^y/n in Probab. 
Similarly, by using (5.7) in Theorem 5.1, we have 

(6.26) Y K H x t ~ x 0/ h ] > ^n z/2 h in Probab. 

n>t>i>n/2 

Combining (3.2), (6.25) and (6.26), we obtain that 

S 5n l > e\l 2 pl N 2 ^(N > l)K[(x t - Xi)/h] 

n>t>i>n/2 

>%*for Y KKxt-xj/h] 

n>t>i>n/2 

> e n p 2 n n i/2+v h in Probab. 

This provides (6.24) and also completes the proof of (6.21). 
Next prove (6.19). We have 

n 

Sin = 2p n Y m(xi)u t+ iK[(x t - Xi)/h] + 5 4n i, 

i,t=l 

where, by recalling \f(x t ,6) — f(xt,9)\ < C\\6 — 9\\(1 + \xt\^) by Assump- 
tion 4, it follows from Proposition 6.1 that 

n 

l^ml < Y 0- + \^i\ v )\f{xu 0) - f(x t , 0)\K[(x t - Xi )/h] 

i,t=l 

n 

< c Pn \\e - e\\ Y( 1 + + \ x tf) K [&t - xt)/h] 

i,t=l 

= P (p n 5 n n^l 2 ^l 2 h). 
This, together with Proposition 6.2, yields that 

S 4n = P (p n 5 n n^l 2 ^l 2 h) + P ( Pn n^l 2 h^) 
= P ( Pn n 5 /^/ 2 h 3 /% 
since 5 2 l n l+ ^ y/n — > 0. The result (6.19) is proved. 
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Finally, we prove (6.20). Under H\, we have 



(6.27) 
where 



V 2 = ^2 K+i + Pnm{xt)) 2 [u* s+ i + p n m(x s )) 2 K 2 [(xt - x s )/h] 

s,t=l,s^t 

<2V 6n + 4V 7n + 2V 8n , 



V 6n =J2KliKliK 2 [(x t -x s )/h], 

s,t=l 

n 

Vin = p 2 nYl u* t lim 2 (x s )K 2 [(x t - x s )/h], 

s,t=l 
n 

v 8n = pf^2 ™ 2 {xt)'m 2 (x s )K 2 {(xt - x s )/h}. 

s,t=l 

By recalling |m(x)| < C|x|^ and 

u* 2 +1 < 2(u 2 t+l + \f(x t , 6)-f(xJ)\ 2 ) 
<C[u 2 t+1 + P (Sl)(l + \x t n, 

it following repeatedly from Proposition 6.1 and <5^n 1+ ^\//i — > that 
n 

V 6n <CJ2 [u 2 s+1 + P {5l){l + \x s n][u 2 t+1 + P (5 2 n )(l + \x t \ 2 ?)\ 

s,t=l 

x K 2 [(x t — x s )/h] 
= P (n 3 / 2 h) + Opidln^h) + Op{5 A y 2+2 ^h) 
= P (n 3 / 2 h). 
Similarly, we have 



V 7n < Cp 2 n ]T + P {5 2 n ){\ + \x s \ 2 ?)]{l + \x t \ 2 »)]K 2 [{x t - x s )/h] 

s,t=l 

= Op( P 2 y' 2 ^h) + P (pl5 2 y/ 2 ^h) = Op{pln^h\ 
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n 

V% n <p 4 n ^2(l + N 2l/ )(1 + \x s \ 2 »)K 2 [{x t - x s )/h] 

s,t=l 

= P { P in^h). 
Combining all these estimates, we obtain 

Vl = P {n^h) + P (p 2 n n 3 / 2 ^h) + P {pW 2+2 »h) 

= P (n 3 / 2 h + p 4 n n 3 / 2+2 »h) 
as required. The proof of Theorem 3.2 is complete. 

6.3. Proof of Theorem 3.3. We first assume $|u_t| \le A$, where A is a
constant. This restriction will be removed later. Write
$G_n(t) = x_{[nt]}/(\sqrt{n}\phi)$ and
$V_n(t) = \frac{1}{\sqrt{n}\,\sigma} \sum_{j=1}^{[nt]} u_{j+1}$. Under Assumptions 1
and 2, the same arguments as those in Buchmann and Chan (2007) or Wang and
Phillips (2009b), with minor modifications, show that

(6.28) $(G_n, V_n) \Rightarrow_D (G, V)$

on $D[0,1]^2$, where $G(t) = W(t) + \kappa \int_0^t e^{\kappa(t-s)} W(s)\,ds$. By
virtue of (6.28), it follows from the so-called Skorohod-Dudley-Wichura
representation theorem that there is a common probability space
$(\Omega, \mathcal{F}, P)$ supporting $(G_n^0, V_n^0)$ and (G, V) such that

(6.29) $(G_n, V_n) =_d (G_n^0, V_n^0)$ and $(G_n^0, V_n^0) \to_{a.s.} (G, V)$

in $D[0,1]^2$ with the uniform topology. Moreover, as in the proof of Lemma 2.1
in Park and Phillips (2001), $V_n^0$ can be chosen such that, for each
$n \ge 1$,

(6.30) $V_n^0(k/n) = V(\tau_{n,k}/n)$, k = 1, 2, ..., n,

where T nik , 1 < k < n, are stopping times with respect to J^ lk in (f2, J 7 , P) 
with 

7%,k = °{V(r),r < Tn , k /n; G n (s/n),s = 1, . . . ,k + 1}, 
satisfying r„ i0 = 0, 



(6.31) sup -> a . s . 

l<fc<n n° 

as n — > oo for any 1/2 < S < 1, and 

- r n fc _i) I J^ 1 = o-~ 2 £;[n| +1 I and 

(6.32) 

S[(r„, fc -r n , fc _ 1 )^|jJ fc _ 1 ]<C(7- 4m ^[«tfi|J r fc], m>l, a.s. 
for some constant C > 0. We mention that result (6.32) does not explicitly 
appear in Lemma 2.1 of Park and Phillips (2001); however, it can be obtained 
by a construction along the same lines as Theorem Al of Hall and Heyde 
(1980). 
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It follows from (6.30) that, under the extended probability space, 

/ 1 n 1 n \ 

™ 1=2 



<rd n 



.33) 



1=2 



=- [jy^t/n) - V(r n , t ^/n)]Y: >t , I f> n f ) 

\l=2 1=2 / 



where, with c n = -y/n^/Zi, 



= T~ £[*W») " ^(rn,i-i/n)]if {cn[G°(t/n) - G° (i/n)]}. 



1-1 



i=l 



To establish our main result, we extend ^r=2[^( T «>*/ n ) ~~ V( T n,t-i/n)]Y* t 
to a continuous martingale. This can be done by defining 



j'-i 



(6.34) M n (r)=^F n * t 



1=2 



r 



7~n,t 
71 



r 



T n ,l-1 



n 



»:l 



V(r) - V 



T n,j-1 



n 



+ 



V(r)-V 



n 



for T nt j-i/n <r< T n j/n,j = 1,2, . . . ,n, and 

(6.35) M n {r) = jy: t y{^ -V(^j 

for r > T nn /n. It is readily seen that M n is a continuous martingale with 
quadratic variation process [M n ] given by 

(6.36) [M n ] r = £ Itf f ^ - ^) + F n *j 
for r nj _i/n < r < r nj /n, j = 1, 2, . . . , n, and 

n , 

(6.37) [M n ] r = YY% {-f 

t=2 ^ 



11 



H — ( r 

n 



for r > T nn /n. Similarly, the covariance process [M n , V] of M n and V is 
given by 

i-i 



ral 



Tn,t Tn,t-1 



71 



7? 



(6.38) [M n ,y] r = ^y T 

1=2 

for r n j_i/n < r < r nj /n, j = 1,2, . . . ,n, and 

(6.39) [JW B ,n- = £l£(^ 

1=2 ^ 



+ y* • r 



T n,j-1 



11 



T n ,t-1 \ _|_ 1 



for r > T ntn /n. 
Write $\rho_n(t) = \inf\{s: [M_n]_s > t\}$, a sequence of time changes. Note
that $[M_n]_\infty = \infty$ for every $n \ge 1$ and

(6.40) $[M_n, V]_{\rho_n(t)} \to_P 0$ as $n \to \infty$

for every $t \in R$, by (6.42) in Proposition 6.6 below. Theorem 2.3 of Revuz
and Yor [(1999), page 524] yields that, if we call $B_n$ [i.e.,
$B_n(r) = M_n\{\rho_n(r)\}$] the DDS Brownian motion [see, e.g., Revuz and Yor
(1999), page 181] of the continuous martingale $M_n$ defined by (6.34) and
(6.35), then $B_n$ converges in distribution to a Wiener process W. Since the
laws of the processes $B_n$ are all given by Wiener measure, it is plain that
$B_n(r) \Rightarrow W(r)$ (mixing), where the concept of mixing can be found in
Hall and Heyde (1980), page 56. This, together with (6.43) in Proposition 6.7
below, yields that $(B_n(r), [M_n]_1) \Rightarrow (W(r), \eta^2)$, where W is
independent of $\eta^2 = L_G(1,0)$, defined as in (3.8). Now, by noting that
$M_n(1)$ is equal to $B_n([M_n]_1)$, the continuous mapping theorem implies that

(6.41) $(M_n(1), [M_n]_1) \to_D (\eta N, \eta^2)$,

where N is a normal variate independent of $\eta$.

By virtue of (6.33) and (6.41), the required result of the theorem follows
from (6.44) and (6.45) in Proposition 6.7 and Proposition 6.8 below.

It remains to show the following Propositions 6.6-6.8, whose proofs are
given in the supplementary material [Wang and Phillips (2012)]. The proof
of Theorem 3.3 under $|u_j| \le A$ is now complete.

Proposition 6.6. In addition to Assumptions 1-3, assume that $|u_j| \le A$,
$nh^2 \to \infty$ and $h \log n \to 0$. Then, as $n \to \infty$,

(6.42) $[M_n, V]_r \to 0$ in Probab.

uniformly on $r \in [0, T]$, where T is an arbitrary given constant.

Proposition 6.7. In addition to Assumptions 1-3, assume that $|u_j| \le A$,
$nh^2 \to \infty$ and $nh^4 \log^2 n \to 0$. Under the extended probability space
used in (6.29), we have

(6.43) $[M_n]_1 \to_P \eta^2$,

where $\eta^2 = L_G(1,0)$ is defined as in (3.8), and

(6.44) $[M_n]_1 - \frac{1}{d_n^2} \sum_{t=1}^{n} Y_{nt}^2 = o_P(1)$.

Proposition 6.8. In addition to Assumptions 1-3, assume that \uj\ < A, 
nh 2 — > oo and nh 4 log 2 n — > 0. Then, 

n 

(6.45) M n (l)-^Y r 

i=2 



V 



'n,t 

n 



V 



T n ,t-1 



n 



o P (l) 
We next remove the restriction $|u_j| \le A$. To this end, let

$u_{1j} = u_j I(|u_j| \le A/2) - E[u_j I(|u_j| \le A/2) \mid \mathcal{F}_{j-1}]$,
$u_{2j} = u_j I(|u_j| > A/2) - E[u_j I(|u_j| > A/2) \mid \mathcal{F}_{j-1}]$

and

$Y_{1nt} = \sum_{i=1}^{t-1} u_{1,i+1} K[(x_t - x_i)/h]$, $Y_{2nt} = \sum_{i=1}^{t-1} u_{2,i+1} K[(x_t - x_i)/h]$.

With this notation, we may write 

\ n 1 n 1 n 

-r~y] u t +iY nt = — ui jt+ iYi n t + — ui^+iY^n* + -^-X! n2 >*+ 1 ^* 

i=2 dn t=2 " n t=2 dn t=2 



(6.46) 

1 " 

Y n 1 n 2 n \ n 

32 X] Y nt = 12 X] + ^X] Y ±nt Y 2nt + Tfi X! *2nf 
a »t=2 ""(=2 «i=2 ™t=2 

(6.47) 

1 - 

: = w2 X^ + ^ 3n + ^ 4n ' 

fl ™ t=2 

Recall that |uij| < A, and u±j is a martingale difference satisfying 
E(u\ t I = E(u 2 t I(\u t \ < A) I 

- [E{u t I(\u t \ < A) I T t ^)] 2 
— ^ (T 2 a.s. 

as A — > 00. It follows from the proof of (3.7) under \uA < ^4 that, when 
n — > 00 first, and then ^4 — > 00, 

/ n 1 " \ 

( 6 .48) Y t ^i,t+iYinu-^J2 Y int ->u (^? 2 )- 

V " t=2 °»» t=2 / 

Now it is readily seen that the required result will follow if we prove 

(6.49) Ain^pO, i = 1,2,3,4, 

as n —7- 00 first, and then A — > 00. In fact, by virtue of (6.6) in Proposition 6.5, 

sup £u 2 < sup (Euf) l/4: < 00 

Ki<n Ki<n 
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and sup^ K(x) < oo, we have, for 1 < t < n, 

/t-2 x 
EY 2 t < 2 sup K(x)Eu\ + 2E\S^u i+x K\(x t - Xi)/h] 



\i=l / 
<C sup Euj{l + h 2 yft\ogt + hVt)<Cih^, 

l<i<n 

since h log re — > and n/i 2 — > oo . Similarly, 

EY? nt <C sup J5itfr(|ui| < ^(l + Zi^logt + Zi^) < Ci/iy 7 ^, 

l<?.<n 

EY int < C sup £u 2 I(|ii;| >A)(1 + h 2 Vilogt + hVi) < C x A~ 2 hsfn. 

l<i<n 

These results, together with the fact that u±j and U2j both are martingale 
difference satisfying 

sup£(^ >i+1 | T j) < sup[S(u4 | F0 2 < C, 

j j 

supE(ul J+1 | Tj) < sup E(u 2 I\ Uj \ >A | Tj) 
j j 

< A~ 2 sup E(u) \Tj)< CA~ 2 , 
j 

yield that, as n — > oo first, and then A — > oo, 

£A?„ < £ £Y 2 2 ni < CA~ 2 0, 

„ n. 

This proves (6.49), and hence the proof of Theorem 3.3 is complete. 

Acknowledgments. Our thanks to the Editor, Associate Editor and ref- 
erees for helpful comments on earlier versions. 

SUPPLEMENTARY MATERIAL 

Supplement to "A specification test for nonlinear nonstationary models"
(DOI: 10.1214/12-AOS975SUPP; .pdf). Further details on the derivations
in the present paper and supporting lemmas and proofs of the main results
on convergence to intersection local time are contained in the supplement
to the paper, Wang and Phillips (2012).

REFERENCES 

Akonom, J. (1993). Comportement asymptotique du temps d'occupation du processus
des sommes partielles. Ann. Inst. Henri Poincaré Probab. Stat. 29 57-81. MR1204518

Aldous, D. J. (1986). Self-intersections of 1-dimensional random walks. Probab. Theory
Related Fields 72 559-587. MR0847386

Bandi, F. M. and Phillips, P. C. B. (2003). Fully nonparametric estimation of scalar
diffusion models. Econometrica 71 241-283. MR1956859

Bandi, F. M. and Phillips, P. C. B. (2007). A simple approach to the parametric
estimation of potentially nonstationary diffusions. J. Econometrics 137 354-395.
MR2354949

Berkes, I. and Horváth, L. (2006). Convergence of integral functionals of stochastic
processes. Econometric Theory 22 304-322. MR2230391

Borodin, A. N. and Ibragimov, I. A. (1994). Limit theorems for functionals of random
walks. Tr. Mat. Inst. Steklova 195 286. MR1368394

Buchmann, B. and Chan, N. H. (2007). Asymptotic theory of least squares estimators
for nearly unstable processes under strong dependence. Ann. Statist. 35 2001-2017.
MR2363961

Chan, N. H. and Wei, C. Z. (1987). Asymptotic inference for nearly nonstationary AR(1)
processes. Ann. Statist. 15 1050-1063. MR0902245

Choi, I. and Saikkonen, P. (2004). Testing linearity in cointegrating smooth transition
regressions. Econom. J. 7 341-365. MR2103419

Choi, I. and Saikkonen, P. (2010). Tests for nonlinear cointegration. Econometric Theory
26 682-709. MR2646476

Csörgő, M. and Révész, P. (1981). Strong Approximations in Probability and Statistics.
Academic Press, New York. MR0666546

de Jong, R. and Wang, C.-H. (2005). Further results on the asymptotics for nonlinear
transformations of integrated time series. Econometric Theory 21 413-430. MR2179544

Gao, J. (2007). Nonlinear Time Series: Semiparametric and Nonparametric Methods.
Monographs on Statistics and Applied Probability 108. Chapman & Hall/CRC, Boca
Raton, FL. MR2297190

Gao, J., King, M., Lu, Z. and Tjøstheim, D. (2009a). Nonparametric specification
testing for nonlinear time series with nonstationarity. Econometric Theory 25
1869-1892. MR2557585

Gao, J., King, M., Lu, Z. and Tjøstheim, D. (2009b). Specification testing in nonlinear
and nonstationary time series autoregression. Ann. Statist. 37 3893-3928. MR2572447

Hall, P. and Heyde, C. C. (1980). Martingale Limit Theory and Its Application.
Academic Press, New York. MR0624435

Hong, S. H. and Phillips, P. C. B. (2010). Testing linearity in cointegrating relations
with an application to purchasing power parity. J. Bus. Econom. Statist. 28 96-114.
MR2650603

Horowitz, J. L. and Spokoiny, V. G. (2001). An adaptive, rate-optimal test of a
parametric mean-regression model against a nonparametric alternative. Econometrica 69
599-631. MR1828537

Jeganathan, P. (2004). Convergence of functionals of sums of r.v.s to local times of
fractional stable motions. Ann. Probab. 32 1771-1795. MR2073177

Karlsen, H. A., Myklebust, T. and Tjøstheim, D. (2007). Nonparametric estimation
in a nonlinear cointegration type model. Ann. Statist. 35 252-299. MR2332276

Karlsen, H. A. and Tjøstheim, D. (2001). Nonparametric estimation in null recurrent
time series. Ann. Statist. 29 372-416. MR1863963

Kasparis, I. and Phillips, P. C. B. (2012). Dynamic misspecification in nonparametric
cointegrating regression. J. Econom. To appear.

Li, Q. and Wang, S. (1998). A simple consistent bootstrap test for a parametric regression
function. J. Econometrics 87 145-165. MR1648892

Marmer, V. (2008). Nonlinearity, nonstationarity, and spurious forecasts.
J. Econometrics 142 1-27. MR2408730

Park, J. Y. and Phillips, P. C. B. (1999). Asymptotics for nonlinear transformations
of integrated time series. Econometric Theory 15 269-298. MR1704225

Park, J. Y. and Phillips, P. C. B. (2001). Nonlinear regressions with integrated time
series. Econometrica 69 117-161. MR1806536

Phillips, P. C. B. (1987). Towards a unified asymptotic theory for autoregression.
Biometrika 74 535-547. MR0909357

Phillips, P. C. B. (1988). Regression theory for near-integrated time series. Econometrica
56 1021-1043. MR0964147

Phillips, P. C. B. (2009). Local limit theory and spurious nonparametric regression.
Econometric Theory 25 1466-1497. MR2557571

Phillips, P. C. B. and Park, J. Y. (1998). Nonstationary density estimation and kernel
autoregression. Discussion Paper 1181, Cowles Foundation, Yale Univ.

Phillips, P. C. B. and Solo, V. (1992). Asymptotics for linear processes. Ann. Statist.
20 971-1001. MR1165602

Revuz, D. and Yor, M. (1999). Continuous Martingales and Brownian Motion, 3rd ed.
Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of
Mathematical Sciences] 293. Springer, Berlin. MR1725357

Schienle, M. (2008). Nonparametric nonstationary regression. Doctoral thesis,
Mannheim Univ.

van der Hofstad, R., den Hollander, F. and König, W. (1997). Central limit theorem
for the Edwards model. Ann. Probab. 25 573-597. MR1434119

van der Hofstad, R., den Hollander, F. and König, W. (2003). Weak interaction
limits for one-dimensional random polymers. Probab. Theory Related Fields 125
483-521. MR1974412

van der Hofstad, R. and König, W. (2001). A survey of one-dimensional random
polymers. J. Stat. Phys. 103 915-944. MR1851362

Wang, Q. and Phillips, P. C. B. (2009a). Asymptotic theory for local time density
estimation and nonparametric cointegrating regression. Econometric Theory 25
710-738. MR2507529

Wang, Q. and Phillips, P. C. B. (2009b). Structural nonparametric cointegrating
regression. Econometrica 77 1901-1948. MR2573873

Wang, Q. and Phillips, P. C. B. (2011). Asymptotic theory for zero energy
functionals with nonparametric regression applications. Econometric Theory 27 235-259.
MR2782038

Wang, Q. and Phillips, P. C. B. (2012). Supplement to "A specification test for
nonlinear nonstationary models." DOI:10.1214/12-AOS975SUPP.



School of Mathematics and Statistics 
University of Sydney 
NSW 2006 
Australia 

E-MAIL: qiying@maths.usyd.edu.au 



Department of Economics 

Yale University 

30 Hillhouse Avenue 

New Haven, Connecticut 06520 

USA 

E-MAIL: peter.phillips@yale.edu 



